Subject Classification in the Oxford English Dictionary

Authors

  • Zarrin Langari
  • Frank Wm. Tompa
Abstract

The Oxford English Dictionary is a valuable source of lexical information and a rich testing ground for mining highly structured text. Each entry is organized into a hierarchy of senses, which include definitions, labels, and cited quotations. Subject labels distinguish the subject classification of a sense; for example, they signal how a word may be used in Anthropology, Music, or Computing. Unfortunately, subject labeling in the dictionary is incomplete. To overcome this incompleteness, we attempt to classify the senses (i.e., definitions) in the dictionary by their subjects, using the citations as an information guide. We report on four different approaches: Nearest Neighbors, a standard classification technique; Term Weighting, an information retrieval method dealing with text; Naive Bayes, a probabilistic method; and Expectation Maximization, an iterative probabilistic method. Experimental performance of these methods is compared based on standard classification metrics.
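To make the idea of classifying senses from their citations concrete, the following is a minimal sketch of one of the four approaches named above, Naive Bayes. It is not the authors' implementation: it assumes a multinomial model with add-one smoothing, a crude letter-only tokenizer, and treats each sense as the bag of words from its citations. The function names (`tokenize`, `train_naive_bayes`, `classify`) and the toy citations are hypothetical and invented for illustration; they are not taken from the dictionary.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split on anything that is not a letter."""
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


def train_naive_bayes(labeled_senses):
    """Collect counts from (subject_label, [citation, ...]) pairs.

    Each pair stands for one sense that already carries a subject label.
    """
    sense_counts = Counter()            # number of labeled senses per subject
    word_counts = defaultdict(Counter)  # word frequencies per subject
    vocab = set()
    for subject, citations in labeled_senses:
        sense_counts[subject] += 1
        for citation in citations:
            for tok in tokenize(citation):
                word_counts[subject][tok] += 1
                vocab.add(tok)
    return sense_counts, word_counts, vocab


def classify(model, citations):
    """Return the subject with the highest posterior log-probability."""
    sense_counts, word_counts, vocab = model
    total_senses = sum(sense_counts.values())
    # Words never seen in training are ignored (an assumption of this sketch).
    tokens = [t for c in citations for t in tokenize(c) if t in vocab]
    best_subject, best_score = None, -math.inf
    for subject in sorted(sense_counts):
        total_words = sum(word_counts[subject].values())
        score = math.log(sense_counts[subject] / total_senses)
        for tok in tokens:
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[subject][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_subject, best_score = subject, score
    return best_subject


# Hypothetical toy data: two labeled senses with invented citations.
TOY_SENSES = [
    ("Music", ["The violin played the melody in a minor key.",
               "A chord of three notes opens the sonata."]),
    ("Computing", ["The program stores each file on the disk.",
                   "A compiler translates the source code."]),
]

if __name__ == "__main__":
    model = train_naive_bayes(TOY_SENSES)
    print(classify(model, ["The sonata ends on a quiet chord."]))
```

In this framing, the senses that already have subject labels serve as training data, and an unlabeled sense is assigned the subject whose citation vocabulary best explains its own citations.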


Similar resources

An Investigation into Bilingual Dictionary Use: Do the Frequency of Use and Type of Dictionary Make a Difference in L2 Writing Performance?

Bilingual dictionary use in L2 writing test performance has recently been the subject of debate. Opinions differ according to how the trait is understood and whether the system favors the process-oriented or product-oriented views towards the assessment and writing skill. Given the need for more empirical support, this study is aimed at investigating the availability of bilingual dictionary use...


Sentiment Analysis of Social Networking Data Using Categorized Dictionary

Sentiment analysis is the process of analyzing a person’s perception or belief about a particular subject matter. However, finding correct opinion or interest from multi-facet sentiment data is a tedious task. In this paper, a method to improve the sentiment accuracy by utilizing the concept of categorized dictionary for sentiment classification and analysis is proposed. A categorized dictiona...


A Freely Available Syntactic Lexicon for English

This paper presents a syntactic lexicon for English that was originally derived from the Oxford Advanced Learner’s Dictionary and the Oxford Dictionary of Current Idiomatic English, and then modified and augmented by hand. There are more than 37,000 syntactic entries from all 8 parts of speech. An X-windows based tool is available for maintaining the lexicon and performing searches. C and Lisp ...


Oxford Dictionary of English - current developments

This research note describes the early stages of a project to enhance a monolingual English dictionary database as a resource for computational applications. It considers some of the issues involved in deriving formal lexical data from a natural-language dictionary.


EFL Translation Students' Perspective toward Using Bilingual Dictionary in Translation of Polysemous Words

This research presented the use of bilingual dictionary and addressed the EFL translation students' points of view on the use of bilingual dictionary in translating polysemous words (English to Persian). Moreover, it aimed at finding the possible relationship between the effect of using bilingual dictionary by students in translating polysemous words and their achieved scores. In the study ...




Publication date: 2001